Metabolic and genetic risk factors associated with pre-diabetes and type 2 diabetes in Thai healthcare employees: A long-term study from the Siriraj Health (SIH) cohort study

Background The study of non-communicable diseases (NCDs) in a developing country like Thailand has rarely been conducted in long-term cohorts, especially among the working-age population. We aimed to assess the prevalence and incidence of risk factors underlying NCDs, and their associations, especially for type-2 diabetes mellitus (T2DM), among healthcare workers enrolled in the Siriraj Health (SIH) study cohort. Methods The SIH study was designed as a longitudinal cohort and conducted at Siriraj Hospital, Thailand. A total of 5,011 participants (77% women) were recruited and followed up. Physical examinations, blood biochemical analyses, family history, behaviors, and genetic factors were assessed. Results The average age was 35.44±8.24 years and 51% of participants were overweight or obese. Men had a higher prevalence of T2DM and dyslipidemia (DLP) than women. Aging was significantly associated with pre-diabetes and T2DM (P<0.001). Additionally, aging, metabolic syndrome, and elevated triglycerides were associated with the development of pre-diabetes and T2DM. The minor T alleles of rs7903146 (C/T) and rs4506565 (A/T) were associated with a higher risk of developing pre-diabetes, with odds ratios of 2.74 (95% confidence interval [CI]: 0.32–23.3) and 2.71 (95% CI: 0.32–23.07), respectively; however, these associations were not statistically significant (P>0.05). Conclusion The findings of the SIH study provide a comprehensive understanding of the health status, risk factors, and genetic factors related to T2DM in a specific working population and highlight areas for further research and intervention to address the growing burden of T2DM and NCDs.


Introduction
Non-communicable diseases (NCDs) represent a global health crisis, driven by factors such as rapid urbanization, unhealthy diets, physical inactivity, and an aging population [1][2][3][4]. Metabolic risk factors, including high blood glucose, dyslipidemia (DLP), and obesity, contribute to 41 million NCD-related deaths annually, primarily among those aged 30 to 69 years [5][6][7]. Thailand, an upper middle-income country (UMIC), has seen a substantial rise in its older adult population from 10.7% in 2007 to 19.2% in 2020, with a projected surge to 23% by 2030, aligning with the broader trends observed in the Southeastern Asia region (12.2 to 16.7% in 2020; and 15.7 to 20% in 2030, respectively) [8,9]. In 2018, half of Thailand's population resided in urban areas, and the United Nations predicts that 68% of the global population will be urban dwellers by 2050 [10]. This demographic shift poses substantial challenges to healthcare and sustainable development. Recent data from the Thailand National Health Examination Survey showed that nearly 30% of Thai adults over 40 exhibit dysglycemia, mainly concentrated in urban areas [11]. Obesity, hypertension (HTN), and type-2 diabetes mellitus (T2DM), all NCD risk factors, have surged in Thailand in the last decade [12][13][14].
Promoting healthy diets and physical activity is vital to mitigate NCDs across the lifespan [15,16]. Maintaining good glycemic control can prevent microvascular complications [17]. Addressing modifiable risk factors early and empowering younger populations with relevant knowledge and interventions are crucial steps. Urbanization and mobility bring both challenges and opportunities for research. A quarter of Thai Open University students moved from rural to urban areas from 2005 to 2009, impacting their health behaviours [18]. The Electricity Generating Authority of Thailand (EGAT) cohort study found diverse socioeconomic backgrounds, a wealthier profile, higher male participation in smoking and alcohol, and an upper-class healthcare scheme [19]. Longitudinal urban cohort studies focused on NCDs are essential.
Therefore, we established a long-term cohort database at a Thai urban medical university, implementing rigorous data quality control. A biobank has been established to store specimens for the assessment of genetic factors pertinent to NCD development, especially T2DM. The TCF7L2 gene plays a crucial role in pancreatic β-cell proliferation and the regulation of insulin secretion. Changes in this pathway can lead to T2DM [20]. A common single nucleotide polymorphism (SNP) in the TCF7L2 gene region is associated with T2DM [21]. In Asian populations, including Japanese [22], Thai [23], and Chinese [24], variants of the TCF7L2 gene, such as rs7903146, rs11196205, and rs12255372, have been identified as significant genetic risk factors for T2DM. These findings highlight the genetic heterogeneity of T2DM across different ethnic groups and underscore the importance of understanding population-specific genetic determinants of the disease. Our study aims to explore NCD risk factors and biomarker relationships and to develop a T2DM risk prediction model, while also investigating the association between T2DM and genetic variants of the TCF7L2 gene.

Study population
The Siriraj Health (SIH) study, conducted at Siriraj Hospital, Mahidol University, Bangkok, Thailand (Fig 1), is a comprehensive longitudinal cohort comprising diverse healthcare professionals (doctors, nurses, pharmacists, medical technicians, etc.), support staff (drivers, engineers, security officers, clerks, etc.), and academic personnel (lecturers, researchers, research assistants, etc.). As of November 2023, there were 20,967 individuals working in this university hospital, approximately 80% of whom undergo an annual health check-up. The study included Siriraj Hospital personnel who attended the annual health screening surveillance program. The study excluded individuals who withheld treatment information, were unable to attend follow-up consistently, were unable to participate in the cohort over the next 2 years, or had contraindications for blood sampling, such as a blood clotting disorder. SIH provides a platform for in-depth studies of extensive, long-term data and biological specimens from Siriraj Hospital personnel, based on the annual health screening surveillance program. The sample size was estimated at 5,000 individuals based on NCD prevalence with a 95% confidence interval (CI), standardized for age and sex according to the total population of the study [11]. Between September 2017 and December 2019, 4,518 participants aged 18 to 55 years were enrolled in phase 1 (SIH1). Beginning in January 2020, the second phase of the Siriraj Health study (SIH2) included participants aged over 18 years with no upper age limit. However, due to the COVID-19 pandemic from 2020 to 2022, only 493 participants were enrolled. The total of 5,011 participants is illustrated in Fig 1. The 2020 health check-up follow-up data of SIH1 participants were retrieved from electronic medical records in the hospital database. Of the 4,518 people enrolled in the first cohort (SIH1), 484 participants (10.7%) were lost to follow-up in 2020 due to death, a change of workplace, or loss of contact.

Data collection
During the annual health check-up, Siriraj personnel were provided with initial information about the SIH cohort study. Written informed consent was obtained from all participants before enrollment and specimen collection. The SIH study was conducted in accordance with the principles established by the Declaration of Helsinki and was approved by the Ethics Committee of the Human Research Protection Unit, Faculty of Medicine Siriraj Hospital, Mahidol University (COA no. Si 647/2016). We provided instructions on the study's standard operating procedures (SOPs) to a team of ten well-trained research staff members involved in the SIH study. These staff members were responsible for securing informed consent, conducting face-to-face questionnaire interviews, and collecting biological specimens. Collection took place in a laboratory equipped with a standard biobank for preserving deoxyribonucleic acid (DNA) and plasma samples. A coded identification number was generated for each participant and used to label all biological specimens and subject data. Study data were managed by a bioinformatician using the R program (version 4.1.3, Revolution Analytics, Dallas, TX, USA) and Research Electronic Data Capture (REDCap, version 12.0.13, Vanderbilt, TN, USA) [25]. The computer servers were located at the Siriraj Informatics and Data Innovation Center (SiData+), part of the Faculty of Medicine Siriraj Hospital, Mahidol University.

Physical examinations
Trained nurses from the Department of Preventive and Social Medicine, Faculty of Medicine Siriraj Hospital, Mahidol University, examined participants using established standard procedures. Measurements included weight, height, waist circumference (WC), and blood pressure (BP). Weight was measured to the nearest 0.1 kg (Tanita BWB-800, Tanita Corporation, Tokyo, Japan), and standing height was measured to the nearest 0.1 cm. WC was determined using a flexible, non-stretchable plastic tape positioned at the midpoint between the lowest rib and the upper lateral border of the right iliac crest. Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured using a digital blood pressure monitor (HEM-907, Omron Corporation, Tokyo, Japan).

Laboratory measurements
A 34 ml venous blood sample was collected after a 12-hour fast, and 30 ml of urine was collected from each participant. To prevent the occurrence of hypoglycemic events, all blood samples were collected between 07:00 and 09:00 A.M. Blood samples were distributed into four anticoagulant tubes for complete blood count (CBC), glycated hemoglobin (HbA1c) analysis, plasma separation, and DNA extraction. A separate sodium fluoride tube inhibited glycolysis for fasting blood glucose (FBG) analysis, while a lithium heparin tube was used for total cholesterol (TC), triglyceride (TG), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), creatinine (Cr), and estimated glomerular filtration rate (eGFR) assessment. Urine samples were analyzed for spot urine albumin-to-creatinine ratio (MAU/Cr). Blood samples were centrifuged at 3,000 revolutions per minute for 10 minutes at 4˚C, and plasma samples were aliquoted into cryotubes (Thermo Fisher Scientific, Jiangsu, China). Biochemical assays were performed using automated systems with various methods by an accredited clinical laboratory (Siriraj Hospital, Thailand), as shown in S1 Table. Metabolic syndrome (MetS) in adults was defined using the International Diabetes Federation (IDF) criteria for the South Asian ethnic group [26]. Under the definition applied here, MetS is present if three or more of the following five criteria are met: WC ≥ 90 cm (men) or ≥ 80 cm (women), BP ≥ 130/85 mmHg, TG level ≥ 150 mg/dL, HDL-C level < 40 mg/dL (men) or < 50 mg/dL (women), and FBG ≥ 100 mg/dL.
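The three-of-five rule described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text, not the study's own code: the `Measurements` class and its field names are hypothetical, and the blood pressure criterion is assumed to be met when either the systolic or the diastolic value reaches its threshold.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    """Hypothetical container for one participant's baseline values."""
    sex: str          # "M" or "F"
    wc_cm: float      # waist circumference (cm)
    sbp_mmhg: float   # systolic blood pressure (mmHg)
    dbp_mmhg: float   # diastolic blood pressure (mmHg)
    tg_mg_dl: float   # triglycerides (mg/dL)
    hdl_mg_dl: float  # HDL cholesterol (mg/dL)
    fbg_mg_dl: float  # fasting blood glucose (mg/dL)


def mets_criteria_met(m: Measurements) -> int:
    """Count how many of the five criteria listed in the text are met."""
    is_male = m.sex == "M"
    criteria = [
        m.wc_cm >= (90 if is_male else 80),
        # Assumption: either component reaching its threshold counts.
        m.sbp_mmhg >= 130 or m.dbp_mmhg >= 85,
        m.tg_mg_dl >= 150,
        m.hdl_mg_dl < (40 if is_male else 50),
        m.fbg_mg_dl >= 100,
    ]
    return sum(criteria)


def has_mets(m: Measurements) -> bool:
    """MetS is present when three or more criteria are met."""
    return mets_criteria_met(m) >= 3
```

For example, a woman with WC 82 cm, BP 132/80 mmHg, TG 160 mg/dL, HDL-C 55 mg/dL, and FBG 95 mg/dL meets three criteria (WC, BP, TG) and would be classified as having MetS under this sketch. Treatment-based criteria (for example, current lipid-lowering or antihypertensive medication) are not modeled here.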

DNA extraction and genotyping
Human DNA was extracted from 4 ml of whole venous blood using the Chemagic 360 automated platform and Chemagic DNA blood kits (CMG-1074, PerkinElmer, Baesweiler, Germany), following the manufacturer's instructions. DNA concentration and purity ratio (acceptable range 1.8-1.9) were determined using FLUOstar Omega (software version 5.5 R4, BMG LABTECH, Ortenberg, Germany). DNA samples were stored at -80˚C in the Siriraj Biobank. A total of 3,960 DNA samples were genotyped for SNPs using the Infinium Asian Screening Array (ASA, Illumina, San Diego, CA, USA). The basic microarray technical data for ASA were downloaded from the Illumina official website (http://www.illumina.com). A total of 659,184 SNPs were analyzed for each individual. Before conducting the genotyping analysis, we assessed the call rate for both samples (97% cutoff) and SNPs (90% cutoff), excluding those with call rates below the specified thresholds. Additionally, the sex reported in the demographic data was compared with the sex predicted from genotypes. Genotyping analysis was performed using the PLINK program, version 1.9 (S1 File).
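The call-rate filter described above can be illustrated with a small, library-free Python sketch. It is not the PLINK procedure used in the study: the function names and the genotype matrix layout are hypothetical, and both call rates are assumed to be computed on the full, unfiltered matrix (the text does not state whether samples or SNPs were filtered first).

```python
from typing import List, Optional, Sequence, Tuple

Genotype = Optional[str]  # e.g. "CT"; None marks a missing call


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of non-missing genotype calls."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def qc_filter(
    matrix: List[List[Genotype]],
    sample_min: float = 0.97,  # sample call-rate cutoff from the text
    snp_min: float = 0.90,     # SNP call-rate cutoff from the text
) -> Tuple[List[int], List[int]]:
    """Return (kept_sample_indices, kept_snp_indices).

    matrix[i][j] is the call for sample i at SNP j. Samples and SNPs
    with call rates below their cutoffs are excluded.
    """
    n_snps = len(matrix[0]) if matrix else 0
    kept_samples = [
        i for i, row in enumerate(matrix) if call_rate(row) >= sample_min
    ]
    kept_snps = [
        j for j in range(n_snps)
        if call_rate([row[j] for row in matrix]) >= snp_min
    ]
    return kept_samples, kept_snps
```

With a toy matrix of three samples and two SNPs in which one sample has a missing call at the second SNP, that sample (call rate 50%) and that SNP (call rate about 67%) would both be dropped at the default cutoffs.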

Self-reported underlying diseases and family histories
Participants completed questionnaires about underlying diseases (T2DM, HTN, DLP, ischemic heart disease, stroke, and gout) and family history (heart failure, cancer, and stroke). HTN was defined as SBP and DBP above the threshold (120/80 mmHg) or the current use of antihypertensive medication [27]. The HbA1c diagnostic criteria were < 5.7% for non-diabetes (non-DM), 5.7-6.4% for pre-diabetes (pre-DM), and > 6.4% or the current use of hypoglycemic medication for diabetes mellitus (DM) [28]. Individuals who reported that they did not have DM but had biochemical measurements within the pre-DM range were classified as unaware pre-DM [29] (S1 Fig).
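The HbA1c cutoffs above translate directly into a small classification routine. The following Python sketch is illustrative only; the function and parameter names are hypothetical, and the "unaware pre-DM" check assumes that self-reported DM status is available as a boolean.

```python
def glycemic_status(hba1c_percent: float,
                    on_hypoglycemic_medication: bool = False) -> str:
    """Classify glycemic status from HbA1c (%) using the cutoffs in the text."""
    if on_hypoglycemic_medication or hba1c_percent > 6.4:
        return "DM"
    if hba1c_percent >= 5.7:
        return "pre-DM"
    return "non-DM"


def is_unaware_pre_dm(hba1c_percent: float,
                      self_reported_dm: bool,
                      on_hypoglycemic_medication: bool = False) -> bool:
    """True if the participant reports no DM but falls in the pre-DM range."""
    status = glycemic_status(hba1c_percent, on_hypoglycemic_medication)
    return (not self_reported_dm) and status == "pre-DM"
```

Under this sketch, a participant with an HbA1c of 6.0% who reports no history of diabetes would be counted as unaware pre-DM, whereas one with an HbA1c of 6.5% would be classified as DM regardless of self-report.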

Behavioral risk factors
Participants completed questionnaires about exercise, smoking, and alcohol consumption. Regular exercise was defined as exercising at least once a week. Smoking was defined as being either a former or a current smoker. Alcohol drinking was defined as consuming alcohol at least once a month in the past year.

Statistical analysis
All continuous variables in the demographic data were expressed as the mean ± standard deviation (SD), whereas binary variables were expressed as numbers and percentages. Self-reported data were expressed as percentages with 95% CI. All statistical analyses were conducted using STATA version 14 (STATA Corp., Texas). The unadjusted cross-sectional associations of each risk factor at baseline with non-DM, pre-DM, and T2DM were examined using Chi-squared (χ²) tests. Adjusted odds ratios (OR) and 95% CI for cross-sectional associations at baseline were calculated using multivariable binary logistic regression models. Separate models were used for pre-DM versus non-DM and T2DM versus non-DM. To assess the longitudinal association between baseline risk factors and progression to pre-DM or T2DM, the sample was restricted to individuals without T2DM at baseline. Multinomial logistic regression models were then used to calculate adjusted relative risk ratios (RRR) and 95% CI for T2DM-related outcomes (non-DM, pre-DM, T2DM) at follow-up. The average predicted probabilities of being classified as having normal blood glucose, pre-DM, or T2DM, given specific characteristics (each risk factor of interest for longitudinal analyses), were obtained using the Stata margins command. In this study, we focused on seven SNPs of the TCF7L2 gene: rs7903146 (C/T), rs12255372 (G/T), rs7917983 (C/T), rs4506565 (A/T), rs4132670 (C/T), rs12243326 (C/T), and rs290487 (C/T). Genotype frequencies were tested for Hardy-Weinberg equilibrium (HWE) in both cases and controls with the Pearson χ² test, using the HWE calculator of the Institute of Human Genetics (https://ihg.gsf.de/ihg/index_engl.html). All SNPs were analyzed under codominant, dominant, and recessive models to examine the relationships between the two groups (T2DM and non-DM) using the "SNPassoc" package in R version 3.3.1 (R Foundation, Vienna, Austria). All statistical tests performed in this study were two-tailed, and P<0.05 was considered statistically significant.
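The Pearson χ² test of Hardy-Weinberg equilibrium mentioned above can be sketched as follows for a biallelic SNP. This is a minimal illustration of the standard one-degree-of-freedom test, not a reimplementation of the online calculator used in the study; the function name and the example genotype counts are hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 df) of HWE from genotype counts.

    Returns (chi2, p_value) for observed counts of the AA, AB, and BB
    genotypes.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For hypothetical counts of 25 AA, 50 AB, and 25 BB, the observed and expected counts coincide, so the statistic is 0 and there is no evidence of departure from HWE. A sample of 50 AA, 0 AB, and 50 BB gives a statistic of 100 and a very small P-value.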

Results
Of the 5,011 participants, most were women (3,854 [77%]), the average age was 35.44 ± 8.24 years, and 51% were overweight or obese (Table 1). Notably, men showed a significantly higher prevalence of obesity, abdominal obesity, and HTN compared to women. Men were also more likely than women to have T2DM (elevated FBG and/or HbA1c) and DLP (high TC, TG, and LDL-C, but low HDL-C) (P<0.001).
A total of 1,464 participants (29.2%) met the criteria for MetS, with a higher prevalence among men (544 [47.0%]) than women (920 [23.9%]). Furthermore, 2,540 participants (50.6%) reported regular exercise (at least once a week), and 3,168 (63.2%) reported monthly alcohol consumption. During follow-up, data from 4,038 participants showed significant increases in body mass index (BMI), WC, SBP, DBP, FBG, MetS, and LDL-C levels, particularly among women. Meanwhile, eGFR and Cr levels declined over time. Follow-up data were collected primarily through routine electronic health check-up records, which limited access to key information such as HbA1c levels, underlying diseases, family history, and lifestyle factors. Although participants were physically present at these visits, standard check-up procedures did not allow the collection of data beyond what is routinely recorded.
The distribution of genotypes and allelic frequencies for these TCF7L2 gene polymorphisms is presented in Table 5 and S2 Table in S2 File. Notably, the minor T allele of both rs7903146 (C/T) and rs4506565 (A/T) SNPs showed a trend toward increased risk for developing pre-DM, with odds ratios of 2.74 (95% CI: 0.32-23.3) and 2.71 (95% CI: 0.32-23.07), respectively. However, these associations did not reach statistical significance (P>0.05). Five SNPs (rs12243326, rs12255372, rs4132670, rs4506565, and rs7903146) did not exhibit significant associations with pre-DM and T2DM (S3 Table in S2 File). Additionally, after adjusting for age, it was found that men with the TT genotype of rs7917983 exhibited a 2.81-fold increased risk for T2DM compared to women with the same genotype. Moreover, women with the TT genotype of rs290487 showed a 2.27-fold increased risk for diabetes compared to men with the TT genotype as shown in S3

Discussion
The SIH study distinguishes itself as a comprehensive, population-based cohort encompassing a diverse spectrum of hospital personnel, comprising healthcare practitioners, supporting staff, and academic faculty. An intriguing aspect pertains to the pronounced disparity in the sex composition of this cohort, with women constituting 77% of the participants. The longitudinal framework of this cohort offers a unique vantage point for monitoring health dynamics and disease progression over time. Among the 5,011 individuals drawn from the personnel of Siriraj Hospital, we achieved a remarkable 80.5% follow-up completion rate. Our findings illuminate the prevalence of critical metabolic risk factors within this working population, including obesity, abdominal obesity, HTN, T2DM, DLP, and MetS. These revelations underscore the pressing health challenges confronting this cohort. Notably, the prevalence of T2DM and MetS exhibited an upsurge with advancing age and increasing BMI, reflecting a compelling association with both the aging demographic and the escalating obesity epidemic. This heightened prevalence carries significant implications for individual well-being and places an onus on healthcare systems, emphasizing the urgent need for preventive strategies and health promotion efforts.
In our study, we observed a T2DM prevalence of 6% among the 5,011 participants, a figure that closely aligns with the 6th Thai National Health Examination Survey (NHES VI), which reported a 5.7% prevalence of diabetes among Thais aged 30-44 years [30]. Moreover, the observed T2DM prevalence in our study is similar to the 3.8% prevalence documented among 2,790 university hospital employees in Bangkok [31] and the 2.7% prevalence reported among 3,360 employees of the EGAT in Nonthaburi, an adjacent province of Bangkok [32]. Another striking finding was that 21% and 1.4% of participants had unaware pre-DM and unaware T2DM, respectively. Our findings indicated a significant association between T2DM and factors such as being overweight, obesity, smoking, and alcohol consumption, consistent with observations in heterogeneous populations [33,34]. Furthermore, HTN and DLP emerged as common comorbidities in participants with T2DM, aligning with observations in diverse ethnic populations such as those of India and Saudi Arabia [35,36].
Our multivariate analysis revealed several pivotal risk factors associated with T2DM, most prominently age, MetS, HTN, and DLP. Notably, the strength of association for these predictor variables varied depending on diabetes status. For instance, while elevated TG levels, HTN, DLP, and smoking were significantly associated with T2DM, they were not associated with pre-DM. The observed increase in the odds of diabetes with age concurs with findings from other studies [37,38] and has been attributed to elevated glycated hemoglobin levels and changes in insulin sensitivity [39,40]. Identifying adults with pre-DM is of paramount importance for initiating early preventive or treatment measures, thereby mitigating the burden of diabetes and reducing healthcare expenditures. In contrast to previous research indicating that women had a lower diabetes risk than men [41], our study suggests that women had a higher risk of pre-DM than men, potentially because the higher proportion of women in the cohort led to increased detection rates.
While our study aimed to explore the interplay between T2DM and genetic variants of the TCF7L2 gene, the results provided no evidence of a significant association between the analyzed variants and an increased risk of T2DM within this cohort. Nevertheless, we acknowledge that other risk factors may play an influential role in the development of T2DM, underscoring the complexity of the disease and the need for multifaceted investigations. Genotypic distribution analysis showed minimal variability of glycemic control between CC (wild type) and TT (mutant) of rs7917983 and rs290487 across all groups. The mutant genotype was found in all groups, except for rs12255372, rs4132670, and rs12243326 in the T2DM group, an observation that may be ascribed to the inclusion of younger participants in our study. Our findings align with prior studies that found no significant association of the SNPs with T2DM [42,43]. Similar results were obtained in studies where the T allele had no impact on the association with T2DM [44,45]. In contrast to our study, others concluded that the presence of the rs7903146 C/T SNP was associated with an increased T2DM risk, particularly for the homozygous TT genotype [46,47]. Although no statistically significant difference emerged between the T2DM and non-DM groups regarding genotype or allele frequencies, the calculated ORs suggested that the TT genotypes of the TCF7L2 rs7903146 and rs4506565 polymorphisms carried a risk for pre-DM, with an OR (95% CI) of 2.74 (0.32-23.3) and 2.71 (0.32-23.07), respectively.
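To make concrete why an OR of 2.74 with a 95% CI of 0.32-23.3 is not statistically significant, the following Python sketch back-calculates an approximate standard error, z statistic, and two-sided P-value from a reported OR and its interval. It assumes the interval is a Wald interval that is symmetric on the log-odds scale, which the source does not state explicitly, so the resulting P-values are rough illustrations rather than the study's reported values. The function name is hypothetical.

```python
import math
from typing import Tuple


def wald_summary(odds_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964) -> Tuple[float, float, float]:
    """Approximate (SE of log OR, z, two-sided p) from an OR and its 95% CI.

    Assumes ci = exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value


se, z, p = wald_summary(2.74, 0.32, 23.3)
print(f"SE={se:.2f}, z={z:.2f}, p={p:.2f}")
```

Under this assumption, the standard error of the log OR is slightly above 1 for both SNPs, so the point estimate lies less than one standard error from the null value (OR = 1) and the approximate P-value is well above 0.05, in line with the reported lack of significance. The very wide intervals also suggest that few participants carried the TT genotype.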
The strengths of the SIH study encompass its urbanized, population-based design, the availability of repeated measures and biobank specimen storage, and annual follow-up conducted by healthcare specialists to ensure data precision and punctual sample collection. SIH also recruits a diverse cross-section of individuals characterized by varying educational backgrounds and occupational profiles. This offers an invaluable resource for conducting workplace surveys to monitor transformations in risk determinants and explore causal relationships through interventional trials within a longitudinal cohort. However, our study has some limitations, which warrant consideration. It focuses on hospital personnel, predominantly women, which may restrict the generalizability of findings to the broader population and urban communities. Moreover, loss to follow-up and missing data could introduce bias. Maintaining the cohort and maximizing participant follow-up necessitate substantial efforts. Additionally, data on certain important potential predictors of diseases, such as associations between specific dietary patterns or eating behaviors and the outcome of interest, were not explored. Future research should delve deeper into the interplay between genetics, lifestyle factors, and T2DM development, with larger sample sizes and more diverse populations to validate findings and uncover potential genetic markers specific to this population.

Conclusions
The SIH study provides valuable insights into the landscape of metabolic risk factors, highlighting concerning prevalence rates of conditions like obesity, abdominal obesity, HTN, T2DM, DLP, and MetS. These findings emphasize significant healthcare challenges, particularly in the context of an aging population and a growing obesity epidemic. The study reveals a 6% prevalence of T2DM, a figure that is in alignment with national and regional epidemiological data. Importantly, a substantial portion of the population remains unaware of their diabetes status, underscoring the need for proactive health promotion. Key contributors to the onset of T2DM include advancing age, the presence of MetS, HTN, and DLP, and elevated TG concentrations. Although the study was intended to explore the genetic basis of T2DM, the analysis did not yield significant associations with the TCF7L2 gene, highlighting the multifactorial etiology of T2DM. Future research should incorporate larger, more diverse cohorts to examine genetics, lifestyle, and T2DM in greater depth, validate these findings, and identify population-specific genetic markers.

Table 3. (Continued)
T2DM was defined as a self-reported medical history of diabetes, current use of hypoglycemic medication, and/or an HbA1c level > 6.4%. Non-DM was defined as a self-reported medical history of no DM and an HbA1c level < 5.7%. P-value < 0.05 for test of the null hypothesis that the odds ratio is equal to the odds ratio in the reference category. Sex, age, BMI, HTN, DLP, gout, alcohol consumption, exercise, and smoking were adjusted. OR, odds ratio with 95% confidence intervals (CI). SIH, Siriraj Health; pre-DM, pre-diabetes; T2DM, type 2 diabetes mellitus; BMI, body mass index; WC, waist circumference; BP, blood pressure; MetS, metabolic syndrome; TC, total cholesterol; TG, triglycerides; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol; HTN, hypertension; DLP, dyslipidemia. https://doi.org/10.1371/journal.pone.0303085.t003
Table in S2 File.

Table 4. (Continued)
Normal blood glucose levels were defined as fasting plasma glucose concentrations < 100 mg/dL. P-value < 0.05 for test of the null hypothesis that the relative risk ratios are equal to the relative risk ratios in the reference category. Sex, age, WC, TC, BMI, and BP were adjusted.

Table 5. Genotypic variants of type 2 diabetes-related single nucleotide polymorphisms: associations comparing pre-diabetes and type 2 diabetes mellitus to those without diabetes (control).
Columns: SNPs | Risk allele | % of Genotype | Risk factors for Pre-DM a | Risk factors for T2DM b
c P-value adjusted for age and sex; OR, odds ratio with 95% confidence intervals (CI). SNPs, single-nucleotide polymorphisms; pre-DM, pre-diabetes; T2DM, type 2 diabetes mellitus. https://doi.org/10.1371/journal.pone.0303085.t005